In this paper, we investigate the (in)consistency of different bootstrap
methods for constructing confidence intervals in the class of estimators
that converge at rate $n^{1/3}$. The Grenander estimator, the nonparametric
maximum likelihood estimator of an unknown nonincreasing density function
$f$ on $[0,\infty)$, is a prototypical example. We focus on this example and
explore different approaches to constructing bootstrap confidence intervals
for $f(t_0)$, where $t_0 \in (0,\infty)$ is an interior point. We find that
the bootstrap estimate, when generating bootstrap samples from the empirical
distribution function $\mathbb{F}_n$ or its least concave majorant
$\tilde F_n$, does not have any weak limit in probability. We provide a set
of sufficient conditions for the consistency of any bootstrap method in this
example and show that bootstrapping from a smoothed version of $\tilde F_n$
leads to strongly consistent estimators. The $m$ out of $n$ bootstrap method
is also shown to be consistent while generating samples from $\mathbb{F}_n$
and $\tilde F_n$.

1. Introduction. If $X_1, X_2, \ldots, X_n \sim f$ are a sample from a
nonincreasing density $f$ on $[0,\infty)$, then the Grenander estimator, the
nonparametric maximum likelihood estimator (NPMLE) $\tilde f_n$ of $f$
[obtained by maximizing the likelihood $\prod_{i=1}^n f(X_i)$ over all
nonincreasing densities], may be described as follows: let $\mathbb{F}_n$
denote the empirical distribution function (EDF) of the data, and
$\tilde F_n$ its least concave majorant. Then the NPMLE $\tilde f_n$ is the
left-hand derivative of $\tilde F_n$. This result is due to Grenander (1956)
and is described in detail by Robertson, Wright and Dykstra (1988), pages
326-328. Prakasa Rao (1969) obtained the asymptotic distribution of
$\tilde f_n$, properly normalized: let $\mathbb{W}$ be a two-sided standard
Brownian motion on $\mathbb{R}$ with $\mathbb{W}(0) = 0$ and

$\mathbb{C} = \arg\max_{s \in \mathbb{R}} [\mathbb{W}(s) - s^2]$.

If $0 < t_0 < \infty$ and $f'(t_0) \neq 0$, then

(1.1)  $n^{1/3}\{\tilde f_n(t_0) - f(t_0)\} \Rightarrow 2|\tfrac12 f(t_0) f'(t_0)|^{1/3}\, \mathbb{C}$,

where $\Rightarrow$ denotes convergence in distribution. There are other
estimators that exhibit similar asymptotic properties; for example,
Chernoff's (1964) estimator of the mode, the monotone regression estimator
[Brunk (1970)], Rousseeuw's (1984) least median of squares estimator, and
the estimator of the shorth [Andrews et al. (1972) and Shorack and Wellner
(1986)]. The seminal paper by Kim and Pollard (1990) unifies $n^{1/3}$-rate
of convergence problems in the more general M-estimation framework. Tables
and a survey of statistical problems in which the distribution of
$\mathbb{C}$ arises are provided by Groeneboom and Wellner (2001).
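
To make the description of the Grenander estimator concrete, the following is a minimal sketch, assuming strictly positive data and using NumPy; the function name `grenander` and the monotone-chain hull scan are our illustrative choices, not part of the original development. It builds the vertices of the EDF, takes their upper (concave) hull, and returns the left-hand slope of that hull at a point $t$.

```python
import numpy as np

def grenander(x, t):
    """Left-hand derivative at t of the least concave majorant (LCM) of the
    EDF of x. Assumes all observations are strictly positive."""
    ux, counts = np.unique(np.asarray(x, dtype=float), return_counts=True)
    # Vertices of the EDF graph: (0, 0) and (x_(i), F_n(x_(i))).
    px = np.concatenate(([0.0], ux))
    py = np.concatenate(([0.0], np.cumsum(counts) / counts.sum()))
    hull = [0]  # indices of the vertices that lie on the LCM
    for i in range(1, len(px)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # Drop b if it lies on or below the chord joining a and i.
            if (py[b] - py[a]) * (px[i] - px[a]) <= (py[i] - py[a]) * (px[b] - px[a]):
                hull.pop()
            else:
                break
        hull.append(i)
    knots = px[hull]
    slopes = np.diff(py[hull]) / np.diff(knots)
    # The segment (knots[j], knots[j+1]] containing t gives the left derivative.
    j = np.searchsorted(knots, t, side="left") - 1
    return float(slopes[j]) if 0 <= j < len(slopes) else 0.0
```

For example, `grenander([1.0, 1.5, 4.0], 1.0)` returns the slope $4/9$ of the first hull segment, which joins $(0,0)$ to $(1.5, 2/3)$; beyond the largest observation the estimate is zero.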

The presence of nuisance parameters in the limiting distribution (1.1) 
complicates the construction of confidence intervals. Bootstrap intervals 
avoid the problem of estimating nuisance parameters and are generally reliable
in problems with $\sqrt{n}$ convergence rates. See Bickel and Freedman
(1981), Singh (1981), Shao and Tu (1995) and its references. Our aim in 
this paper is to study the consistency of bootstrap methods for the Grenan- 
der estimator with the hope that the monotone density estimation problem 
will shed light on the behavior of bootstrap methods in similar cube-root 
convergence problems. 

There has been considerable recent interest in this question. Kosorok
(2008) shows that bootstrapping from the EDF $\mathbb{F}_n$ does not lead to
a consistent estimator of the distribution of
$n^{1/3}\{\tilde f_n(t_0) - f(t_0)\}$. Lee and Pun
(2006) explore m out of n bootstrapping from the empirical distribution 
function in similar nonstandard problems and prove the consistency of the 
method. Leger and MacGibbon (2006) describe conditions for a resampling 
procedure to be consistent under cube root asymptotics and assert that these 
conditions are generally not met while bootstrapping from the EDF. They 
also propose a smoothed version of the bootstrap and show its consistency 
for Chernoff's estimator of the mode. Abrevaya and Huang (2005) show that 
bootstrapping from the EDF leads to inconsistent estimators in the setup of 
Kim and Pollard (1990) and propose corrections. Politis, Romano and Wolf 
(1999) show that subsampling based confidence intervals are consistent in 
this scenario. 

Our work goes beyond that cited above as follows: we show that bootstrapping
from the NPMLE $\tilde F_n$ also leads to inconsistent estimators, a result
that we found more surprising, since $\tilde F_n$ has a density. Moreover, we find
that the bootstrap estimator, constructed from either the EDF or NPMLE, 
has no limit in probability. The finding is less than a mathematical proof, 
because one step in the argument relies on simulation; but the simulations 
make our point clearly. As described in Section 5, our findings are incon- 
sistent with some claims of Abrevaya and Huang (2005). Also, our way 
of tackling the main issues differs from that of the existing literature: we 
consider conditional distributions in more detail than Kosorok (2008), who 
deduced inconsistency from properties of unconditional distributions; we di- 
rectly appeal to the characterization of the estimators and use a continuous 
mapping principle to deduce the limiting distributions instead of using the 
"switching" argument [see Groeneboom (1985)] employed by Kosorok (2008) 
and Abrevaya and Huang (2005); and at a more technical level, we use the 
Hungarian Representation theorem whereas most of the other authors use 
empirical process techniques similar to those described by van der Vaart and 
Wellner (2000). 

Section 2 contains a uniform version of (1.1) that is used later on to study 
the consistency of different bootstrap methods and may be of independent 
interest. The main results on inconsistency are presented in Section 3. Suffi- 
cient conditions for the consistency of a bootstrap method are presented in 
Section 4 and applied to show that bootstrapping from smoothed versions 
of $\tilde F_n$ does produce consistent estimators. The $m$ out of $n$
bootstrapping procedure is investigated, when generating bootstrap samples
from $\mathbb{F}_n$ and $\tilde F_n$. It is shown that both methods lead to
consistent estimators under mild conditions on $m$. In Section 5, we discuss
our findings, especially the nonconvergence and its implications. The
Appendix provides the details of some arguments used in proving the main
results.

2. Uniform convergence. For the rest of the paper, $F$ denotes a
distribution function with $F(0) = 0$ and a density $f$ that is nonincreasing
on $[0,\infty)$ and continuously differentiable near $t_0 \in (0,\infty)$
with nonzero derivative $f'(t_0) < 0$. If $g : I \to \mathbb{R}$ is a bounded
function, write $\|g\| := \sup_{x \in I} |g(x)|$. Next, let $F_n$ be
distribution functions with $F_n(0) = 0$, that converge weakly to $F$ and,
therefore,

(2.1)  $\lim_{n\to\infty} \|F_n - F\| = 0$.

Let $X_{n,1}, X_{n,2}, \ldots, X_{n,m_n} \sim F_n$, where $m_n \le n$ is a
nondecreasing sequence of integers for which $m_n \to \infty$; let
$\mathbb{F}_{n,m_n}$ denote the EDF of $X_{n,1}, X_{n,2}, \ldots, X_{n,m_n}$;
and let

$\Delta_n := m_n^{1/3}\{\tilde f_{n,m_n}(t_0) - f_n(t_0)\}$,

where $\tilde f_{n,m_n}(t_0)$ is the Grenander estimator computed from
$X_{n,1}, X_{n,2}, \ldots, X_{n,m_n}$ and $f_n(t_0)$ is the density of $F_n$
at $t_0$ or a surrogate. Next, let $I_m = [-t_0 m^{1/3}, \infty)$ and

(2.2)  $\mathbb{Z}_n(h) := m_n^{2/3}\{\mathbb{F}_{n,m_n}(t_0 + m_n^{-1/3}h) - \mathbb{F}_{n,m_n}(t_0) - f_n(t_0) m_n^{-1/3} h\}$

for $h \in I_{m_n}$ and observe that $\Delta_n$ is the left-hand derivative
at 0 of the least concave majorant of $\mathbb{Z}_n$. It is fairly easy to
obtain the asymptotic distribution of $\mathbb{Z}_n$. The asymptotic
distribution of $\Delta_n$ may then be obtained from the Continuous Mapping
theorem. Stochastic processes are regarded as random elements in
$D(\mathbb{R})$, the space of right continuous functions on $\mathbb{R}$ with
left limits, equipped with the projection $\sigma$-field and the topology of
uniform convergence on compacta. See Pollard (1984), Chapters IV and V for
background.

2.1. Convergence of $\mathbb{Z}_n$. It is convenient to decompose
$\mathbb{Z}_n$ into the sum of $\mathbb{Z}_{n,1}$ and $\mathbb{Z}_{n,2}$ where

$\mathbb{Z}_{n,1}(h) := m_n^{2/3}\{(\mathbb{F}_{n,m_n} - F_n)(t_0 + m_n^{-1/3}h) - (\mathbb{F}_{n,m_n} - F_n)(t_0)\}$,

$\mathbb{Z}_{n,2}(h) := m_n^{2/3}\{F_n(t_0 + m_n^{-1/3}h) - F_n(t_0) - f_n(t_0) m_n^{-1/3} h\}$.

Observe that $\mathbb{Z}_{n,2}$ depends only on $F_n$ and $f_n$; only
$\mathbb{Z}_{n,1}$ depends on $X_{n,1}, \ldots, X_{n,m_n}$. Let $\mathbb{W}_1$
be a standard two-sided Brownian motion on $\mathbb{R}$ with
$\mathbb{W}_1(0) = 0$, and $\mathbb{Z}_1(h) = \mathbb{W}_1[f(t_0)h]$.

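Proposition 2.1. If

(2.3)  $\lim_{n\to\infty} m_n^{1/3} |F_n(t_0 + m_n^{-1/3}h) - F_n(t_0) - f(t_0) m_n^{-1/3} h| = 0$

uniformly on compacts (in $h$), then $\mathbb{Z}_{n,1} \Rightarrow \mathbb{Z}_1$;
and if there is a continuous function $\mathbb{Z}_2$ for which

(2.4)  $\lim_{n\to\infty} \mathbb{Z}_{n,2}(h) = \mathbb{Z}_2(h)$

uniformly on compact intervals, then
$\mathbb{Z}_n \Rightarrow \mathbb{Z} := \mathbb{Z}_1 + \mathbb{Z}_2$.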

Proof. The Hungarian Embedding theorem of Komlós, Major and Tusnády (1975)
is used. We may suppose that $X_{n,i} = F_n^{\#}(U_i)$, where
$F_n^{\#}(u) = \inf\{x : F_n(x) \ge u\}$ and $U_1, U_2, \ldots$ are i.i.d.
Uniform(0, 1) random variables. Let $\mathbb{U}_n$ denote the EDF of
$U_1, \ldots, U_n$, $\mathbb{E}_n(t) = \sqrt{n}[\mathbb{U}_n(t) - t]$, and
$\mathbb{V}_n = \sqrt{m_n}(\mathbb{F}_{n,m_n} - F_n)$. Then
$\mathbb{V}_n = \mathbb{E}_{m_n} \circ F_n$. By Hungarian Embedding, we may
also suppose that the probability space supports a sequence of Brownian
Bridges $\{\mathbb{B}_n^0\}_{n \ge 1}$ for which

(2.5)  $\sup_{0 \le t \le 1} |\mathbb{E}_n(t) - \mathbb{B}_n^0(t)| = O\left[\frac{\log(n)}{\sqrt{n}}\right]$ a.s.

and a standard normal random variable $\eta$ that is independent of
$\{\mathbb{B}_n^0\}_{n \ge 1}$. Define a version $\mathbb{B}_n$ of Brownian
motion by $\mathbb{B}_n(t) = \mathbb{B}_n^0(t) + \eta t$, for $t \in [0, 1]$.
Then

(2.6)  $\mathbb{Z}_{n,1}(h) = m_n^{1/6}\{\mathbb{E}_{m_n}[F_n(t_0 + m_n^{-1/3}h)] - \mathbb{E}_{m_n}[F_n(t_0)]\}$
       $= m_n^{1/6}\{\mathbb{B}_{m_n}[F_n(t_0 + m_n^{-1/3}h)] - \mathbb{B}_{m_n}[F_n(t_0)]\} + \mathbb{R}_n(h)$,

where

$|\mathbb{R}_n(h)| \le 2 m_n^{1/6} \sup_{0 \le t \le 1} |\mathbb{E}_{m_n}(t) - \mathbb{B}^0_{m_n}(t)| + m_n^{1/6} |\eta|\, |F_n(t_0 + m_n^{-1/3}h) - F_n(t_0)| \to 0$

uniformly on compacta w.p. 1 using (2.3) and (2.5). Let

$\mathbb{X}_n(h) := m_n^{1/6}\{\mathbb{B}_{m_n}[F_n(t_0 + m_n^{-1/3}h)] - \mathbb{B}_{m_n}[F_n(t_0)]\}$

and observe that $\mathbb{X}_n$ is a mean zero Gaussian process defined on
$I_{m_n}$ with independent increments and covariance kernel

$K_n(h_1, h_2) = m_n^{1/3} |F_n[t_0 + \mathrm{sign}(h_1) m_n^{-1/3}(|h_1| \wedge |h_2|)] - F_n(t_0)|\, \mathbf{1}\{h_1 h_2 > 0\}$.

It now follows from Theorem V.19 in Pollard (1984) and (2.3) that
$\mathbb{X}_n(h)$ converges in distribution to $\mathbb{W}_1[f(t_0)h]$ in
$D([-c, c])$ for every $c > 0$, establishing the first assertion of the
proposition. The second then follows from Slutsky's theorem. □

2.2. Convergence of $\Delta_n$. Unfortunately, $\Delta_n$ is not quite a
continuous functional of $\mathbb{Z}_n$. If $f : I \to \mathbb{R}$, write
$f|J$ to denote the restriction of $f$ to $J \subseteq I$; and if $I$ and $J$
are intervals and $f$ is bounded, write $L_J f$ for the least concave
majorant of the restriction. Thus, $\tilde F_n = L_{[0,\infty)} \mathbb{F}_n$
in the Introduction.

Lemma 2.2. Let $I$ be a closed interval; let $f : I \to \mathbb{R}$ be a
bounded upper semi-continuous function on $I$; and let
$a_1, a_2, b_1, b_2 \in I$ with $b_1 < a_1 < a_2 < b_2$. If
$2f[\tfrac12(a_i + b_i)] > L_I f(a_i) + L_I f(b_i)$, $i = 1, 2$, then
$L_I f(x) = L_{[b_1, b_2]} f(x)$ for $a_1 \le x \le a_2$.

Proof. This follows from the proof of Lemmas 5.1 and 5.2 of Wang and 
Woodroofe (2007). In that lemma continuity was assumed, but only upper 
semi-continuity was used in the (short) proof. □ 

Recall Marshall's lemma: if $I$ is an interval, $f : I \to \mathbb{R}$ is
bounded, and $g : I \to \mathbb{R}$ is concave, then
$\|L_I f - g\| \le \|f - g\|$. See, for example, Robertson, Wright and
Dykstra [(1988), page 329] for a proof. Write
$\tilde F_{n,m_n} = L_{[0,\infty)} \mathbb{F}_{n,m_n}$.

Lemma 2.3. If $\delta > 0$ is so small that $F$ is strictly concave on
$[t_0 - 2\delta, t_0 + 2\delta]$ and (2.1) holds then
$\tilde F_{n,m_n} = L_{[t_0 - 2\delta, t_0 + 2\delta]} \mathbb{F}_{n,m_n}$ on
$[t_0 - \delta, t_0 + \delta]$ for all large $n$ w.p. 1.

Proof. Since $F$ is strictly concave on $[t_0 - 2\delta, t_0 + 2\delta]$,
$2F(t_0 \pm \tfrac32\delta) > F(t_0 \pm \delta) + F(t_0 \pm 2\delta)$. Then

$\|\tilde F_{n,m_n} - F\| \le \|\mathbb{F}_{n,m_n} - F\| \le \frac{1}{\sqrt{m_n}} \|\mathbb{E}_{m_n}\| + \|F_n - F\| \to 0$ w.p. 1

by Marshall's lemma, (2.1) and the Glivenko-Cantelli theorem. Thus,
$2\tilde F_{n,m_n}(t_0 \pm \tfrac32\delta) > \tilde F_{n,m_n}(t_0 \pm \delta) + \tilde F_{n,m_n}(t_0 \pm 2\delta)$,
for all large $n$ w.p. 1, and Lemma 2.3 follows from Lemma 2.2. □



Proposition 2.4. (i) Suppose that (2.1) and (2.3) hold and given
$\gamma > 0$, there are $0 < \delta < 1$ and $C > 0$ for which

(2.7)  $|F_n(t_0 + h) - F_n(t_0) - f_n(t_0)h - \tfrac12 f'(t_0)h^2| \le \gamma h^2 + C m_n^{-2/3}$

and

(2.8)  $|F_n(t_0 + h) - F_n(t_0)| \le C(|h| + m_n^{-1/3})$

for $|h| \le \delta$ and for all large $n$. If $J$ is a compact interval and
$\varepsilon > 0$, then there is a compact $K \supseteq J$, depending only on
$\varepsilon, J, C, \gamma$, and $\delta$, for which

(2.9)  $P[L_{I_{m_n}} \mathbb{Z}_n = L_K \mathbb{Z}_n \text{ on } J] \ge 1 - \varepsilon$

for all large $n$.

(ii) Let $\mathbb{Y}$ be an a.s. continuous stochastic process on
$\mathbb{R}$ that is a.s. bounded above. If
$\lim_{|h| \to \infty} \mathbb{Y}(h)/|h| = -\infty$ a.e., then the compact
$K \supseteq J$ can be chosen so that

(2.10)  $P[L_{\mathbb{R}} \mathbb{Y} = L_K \mathbb{Y} \text{ on } J] \ge 1 - \varepsilon$.

Proof. For a fixed sequence ($F_n \equiv F$), (2.9) would follow from the
assertion in Example 6.5 of Kim and Pollard (1990), and it is possible to adapt
their argument to a triangular array using (2.7) and (2.8) in place of Taylor 
series expansion. A different proof is presented in the Appendix. □ 

We will use the following easily verified fact. In its statement, the metric
space $\mathcal{X}$ is to be endowed with the projection $\sigma$-field. See
Pollard (1984), page 70.

Lemma 2.5. Let $\{X_{n,c}\}$, $\{Y_n\}$, $\{W_c\}$ and $Y$ be sets of random
elements taking values in a metric space $(\mathcal{X}, d)$,
$n = 0, 1, \ldots$, and $c \in \mathbb{R}$. If for any $\delta > 0$,

(i) $\lim_{c \to \infty} \limsup_{n \to \infty} P\{d(X_{n,c}, Y_n) > \delta\} = 0$,

(ii) $\lim_{c \to \infty} P\{d(W_c, Y) > \delta\} = 0$,

(iii) $X_{n,c} \Rightarrow W_c$ as $n \to \infty$ for every $c \in \mathbb{R}$,

then $Y_n \Rightarrow Y$ as $n \to \infty$.

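Corollary 2.6. If (2.9) and (2.10) hold, and
$\mathbb{Z}_n \Rightarrow \mathbb{Y}$, then
$L_{I_{m_n}} \mathbb{Z}_n \Rightarrow L_{\mathbb{R}} \mathbb{Y}$ in
$D(\mathbb{R})$ and $\Delta_n \Rightarrow (L_{\mathbb{R}} \mathbb{Y})'(0)$.

Proof. It suffices to show that
$L_{I_{m_n}} \mathbb{Z}_n | J \Rightarrow L_{\mathbb{R}} \mathbb{Y} | J$ in
$D(J)$, for every compact interval $J \subseteq \mathbb{R}$. Given $J$ and
$\varepsilon > 0$, there exists a compact $K_\varepsilon \supseteq J$ such
that (2.9) and (2.10) hold. This verifies (i) and (ii) of Lemma 2.5 with
$c = 1/\varepsilon$, $X_{n,c} = L_{K_\varepsilon} \mathbb{Z}_n$,
$Y_n = L_{I_{m_n}} \mathbb{Z}_n$, $W_c = L_{K_\varepsilon} \mathbb{Y}$,
$Y = L_{\mathbb{R}} \mathbb{Y}$ and $d(x, y) = \sup_{t \in J} |x(t) - y(t)|$.
Clearly, $L_{K_\varepsilon} \mathbb{Z}_n | J \Rightarrow L_{K_\varepsilon} \mathbb{Y} | J$
in $D(J)$, by the Continuous Mapping theorem, verifying condition (iii).
Thus, $L_{I_{m_n}} \mathbb{Z}_n \Rightarrow L_{\mathbb{R}} \mathbb{Y}$ in
$D(\mathbb{R})$. Another application of the Continuous Mapping theorem [via
the lemma on page 330 of Robertson, Wright and Dykstra (1988)] in conjunction
with (2.9), (2.10) and Lemma 2.5 then shows that
$\Delta_n = (L_{I_{m_n}} \mathbb{Z}_n)'(0) \Rightarrow (L_{\mathbb{R}} \mathbb{Y})'(0)$. □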

Corollary 2.7. If (2.1), (2.3), (2.4), (2.7) and (2.8) hold and

$\lim_{|h| \to \infty} \mathbb{Z}(h)/|h| = -\infty$,

then $L_{I_{m_n}} \mathbb{Z}_n \Rightarrow L_{\mathbb{R}} \mathbb{Z}$ in
$D(\mathbb{R})$ and $\Delta_n \Rightarrow (L_{\mathbb{R}} \mathbb{Z})'(0)$;
and if $\mathbb{Z}_2(h) = f'(t_0)h^2/2$, then
$\Delta_n \Rightarrow 2|\tfrac12 f(t_0) f'(t_0)|^{1/3}\, \mathbb{C}$, where
$\mathbb{C}$ has Chernoff's distribution.

Proof. The convergence follows directly from Proposition 2.4 and
Corollary 2.6. Note that if $\mathbb{Z}_2(h) = f'(t_0)h^2/2$, then (2.9) and
(2.10) hold and Corollary 2.6 can be applied. That
$(L_{\mathbb{R}} \mathbb{Z})'(0)$ is distributed as
$2|\tfrac12 f(t_0) f'(t_0)|^{1/3}\, \mathbb{C}$ when
$\mathbb{Z}_2(h) = f'(t_0)h^2/2$ follows from elementary properties of
Brownian motion via the "switching" argument of Groeneboom (1985). □

2.3. Remarks on the conditions. If $F_n = F$ and $f_n = f$, then clearly
(2.1), (2.3), (2.4), (2.7) and (2.8) all hold with
$\mathbb{Z}_2(h) = f'(t_0)h^2/2$ for some $0 < \delta < 1$ and
$C \ge f(t_0 - \delta)$ by a Taylor expansion of $F$ and the continuity of
$f$ and $f'$ around $t_0$.

Corollary 2.8. If there is a $\delta > 0$ for which $F_n$ has a continuously
differentiable density $f_n$ on $[t_0 - \delta, t_0 + \delta]$, and

(2.11)  $\lim_{n \to \infty} \left[ \|F_n - F\| + \sup_{|t - t_0| \le \delta} \left( |f_n(t) - f(t)| + |f_n'(t) - f'(t)| \right) \right] = 0$,

then (2.1), (2.3), (2.4), (2.7) and (2.8) hold with
$\mathbb{Z}_2(h) = f'(t_0)h^2/2$, and
$\Delta_n \Rightarrow 2|\tfrac12 f(t_0) f'(t_0)|^{1/3}\, \mathbb{C}$.

Proof. The result can be immediately derived from Taylor expansion of $F_n$
and the continuity of $f$ and $f'$ around $t_0$. To illustrate the idea, we
show that (2.7) holds. Let $\gamma > 0$ be given. Clearly,

(2.12)  $|F_n(t_0 + h) - F_n(t_0) - f_n(t_0)h - \tfrac12 h^2 f'(t_0)| \le \tfrac12 h^2 \sup_{|s| \le |h|} |f_n'(t_0 + s) - f'(t_0)|$.

Let $\delta > 0$ be so small that $|f'(t) - f'(t_0)| \le \gamma$ for
$|t - t_0| \le \delta$, and let $n_0$ be so large that
$\sup_{|t - t_0| \le \delta} |f_n'(t) - f'(t)| \le \gamma$ for $n \ge n_0$.
Then the last line in (2.12) is at most $\gamma h^2$ for $|h| \le \delta$ and
$n \ge n_0$. □



Another useful remark, used below, is that if
$\lim_{n \to \infty} m_n^{1/3} \|F_n - F\| = 0$, then (2.1), (2.3) and (2.8)
hold.

In the next three sections, we apply Proposition 2.1 and Corollary 2.6 to
bootstrap samples drawn from the EDF, its LCM, and smoothed versions
thereof. Thus, let $X_1, X_2, \ldots \stackrel{\mathrm{ind}}{\sim} F$; let
$\mathbb{F}_n$ be the EDF of $X_1, \ldots, X_n$; and let $\tilde F_n$ be its
LCM. If $F_n = \mathbb{F}_n$, then (2.1), (2.3) and (2.8) hold almost surely
by the above remark, since

(2.13)  $\|\mathbb{F}_n - F\| = O\left[\sqrt{\frac{\log\log(n)}{n}}\right]$ a.s.

by the Law of the Iterated Logarithm for the EDF, which may be deduced
from Hungarian Embedding; and the same is true if $F_n = \tilde F_n$ since
$\|\tilde F_n - F\| \le \|\mathbb{F}_n - F\|$, by Marshall's lemma.

If $m_n = n$ and $f_n = \tilde f_n$, then (2.4) is not satisfied almost
surely or in probability by either $\mathbb{F}_n$ or $\tilde F_n$. For either
choice, (2.7) is satisfied in probability if $f_n = f$.

Proposition 2.9. Suppose that $m_n = n$ and that $f_n = f$. If $F_n$ is
either the EDF $\mathbb{F}_n$ or its LCM $\tilde F_n$, then for any
$\gamma, \varepsilon > 0$, there are $C > 0$ and $0 < \delta < 1$ for which
(2.7) holds with probability at least $1 - \varepsilon$ for all large $n$.

The proof is included in the Appendix. 
3. Inconsistency and nonconvergence of the bootstrap. We begin with
a brief discussion of the bootstrap.

3.1. Generalities. Now, suppose that
$X_1, X_2, \ldots \stackrel{\mathrm{ind}}{\sim} F$ are defined on a
probability space $(\Omega, \mathcal{A}, P)$. Write
$\mathbf{X}_n = (X_1, \ldots, X_n)$ and suppose that the distribution
function, $H_n$ say, of the random variable $R_n(\mathbf{X}_n, F)$ is of
interest. The bootstrap methodology can be broken into three simple steps:

(i) Construct an estimator $\hat F_n$ of $F$ from $\mathbf{X}_n$;

(ii) let $X_1^*, \ldots, X_{m_n}^* \stackrel{\mathrm{ind}}{\sim} \hat F_n$ be
conditionally i.i.d. given $\mathbf{X}_n$;

(iii) then let $\mathbf{X}_n^* = (X_1^*, \ldots, X_{m_n}^*)$ and estimate
$H_n$ by the conditional distribution function of
$R_n^* = R_{m_n}(\mathbf{X}_n^*, \hat F_n)$ given $\mathbf{X}_n$; that is

$H_n^*(x) = P^*\{R_n^* \le x\}$,

where $P^*\{\cdot\}$ is the conditional probability given the data
$\mathbf{X}_n$, or equivalently, the entire sequence
$\mathbf{X} = (X_1, X_2, \ldots)$.

Choices of $\hat F_n$ considered below are the EDF $\mathbb{F}_n$, its least
concave majorant $\tilde F_n$, and smoothed versions thereof.
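
The three steps above can be sketched in code for the simplest choice, the EDF with $m_n = n$. This is only an illustration under our own assumptions: the helper names `grenander` and `bootstrap_roots` are hypothetical, the data are assumed strictly positive, and the root is taken to be $n^{1/3}$ times the difference between the Grenander estimate on the resample and on the original data. The results of Section 3 show that this particular scheme is inconsistent, so the sketch documents the mechanics rather than a recommended procedure.

```python
import numpy as np

def grenander(x, t):
    """Left-hand slope at t of the least concave majorant of the EDF of x."""
    ux, counts = np.unique(np.asarray(x, dtype=float), return_counts=True)
    px = np.concatenate(([0.0], ux))
    py = np.concatenate(([0.0], np.cumsum(counts) / counts.sum()))
    hull = [0]
    for i in range(1, len(px)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # Drop b if it lies on or below the chord joining a and i.
            if (py[b] - py[a]) * (px[i] - px[a]) <= (py[i] - py[a]) * (px[b] - px[a]):
                hull.pop()
            else:
                break
        hull.append(i)
    knots = px[hull]
    slopes = np.diff(py[hull]) / np.diff(knots)
    j = np.searchsorted(knots, t, side="left") - 1
    return float(slopes[j]) if 0 <= j < len(slopes) else 0.0

def bootstrap_roots(x, t0, n_boot=200, seed=None):
    """Draws from the conditional law of n^(1/3) * (f*_n(t0) - f_n(t0))."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    f_hat = grenander(x, t0)            # step (i): estimate built from the data
    roots = np.empty(n_boot)
    for b in range(n_boot):
        # Step (ii): conditionally i.i.d. draws from the EDF (resample with replacement).
        x_star = rng.choice(x, size=n, replace=True)
        # Step (iii): recompute the root on the bootstrap sample.
        roots[b] = n ** (1.0 / 3.0) * (grenander(x_star, t0) - f_hat)
    return roots
```

The empirical distribution of the returned array approximates $H_n^*$; replacing the resampling line with draws from another estimate of $F$ would give the other schemes discussed in the text.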

Let $d$ denote the Lévy metric or any other metric metrizing weak
convergence of distribution functions. We say that $H_n^*$ is weakly,
respectively, strongly, consistent if
$d(H_n, H_n^*) \stackrel{P}{\to} 0$, respectively, $d(H_n, H_n^*) \to 0$ a.s.
If $H_n$ has a weak limit $H$, then consistency requires $H_n^*$ to converge
weakly to $H$, in probability; and if $H$ is continuous, consistency requires

$\sup_{x \in \mathbb{R}} |H_n^*(x) - H(x)| \stackrel{P}{\to} 0$ as $n \to \infty$.

There is also the apparent possibility that $H_n^*$ could converge to a
random limit; that is, that there is a
$G : \Omega \times \mathbb{R} \to [0, 1]$ for which $G(\omega, \cdot)$ is a
distribution function for each $\omega \in \Omega$, $G(\cdot, x)$ is
measurable for each $x \in \mathbb{R}$, and
$d(G, H_n^*) \stackrel{P}{\to} 0$. This possibility is only apparent,
however, if $\hat F_n$ depends only on the order statistics. For if $h$ is a
bounded continuous function on $\mathbb{R}$, then any limit in probability of
$\int_{\mathbb{R}} h(x) H_n^*(\omega; dx)$ must be invariant under finite
permutations of $X_1, X_2, \ldots$ up to equivalence, and thus, must be
almost surely constant by the Hewitt-Savage zero-one law [Breiman (1968)].
Let $\bar G(x) = \int_\Omega G(\omega; x) P(d\omega)$. Then $\bar G$ is a
distribution function and
$\int_{\mathbb{R}} h(x) G(\omega; dx) = \int_{\mathbb{R}} h(x) \bar G(dx)$
a.s. for each bounded continuous $h$, and therefore for any countable
collection of bounded continuous $h$. It follows that
$G(\omega; x) = \bar G(x)$ a.e. $\omega$ for all $x$ by letting $h$ approach
indicator functions.
Now let

$\Delta_n = n^{1/3}\{\tilde f_n(t_0) - f(t_0)\}$ and $\Delta_n^* = m_n^{1/3}\{\tilde f^*_{n,m_n}(t_0) - \hat f_n(t_0)\}$,

where $\hat f_n(t_0)$ is an estimate of $f(t_0)$, for example,
$\tilde f_n(t_0)$, and $\tilde f^*_{n,m_n}(t_0)$ is the Grenander estimator
computed from the bootstrap sample $X_1^*, \ldots, X_{m_n}^*$. Then weak
(strong) consistency of the bootstrap means

(3.1)  $\sup_{x \in \mathbb{R}} |P^*[\Delta_n^* \le x] - P[\Delta_n \le x]| \to 0$

in probability (almost surely), since the limiting distribution (1.1) of
$\Delta_n$ is continuous.
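
In simulations, neither distribution function in (3.1) is available exactly, so one natural diagnostic is the supremum distance between the empirical distribution functions of two sets of draws. The following is a minimal sketch under that assumption; the name `sup_distance` is ours. Because both empirical distribution functions are right-continuous step functions, the supremum is attained at one of the pooled sample points.

```python
import numpy as np

def sup_distance(a, b):
    """Supremum over x of |F_a(x) - F_b(x)| for the EDFs of samples a and b."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate((a, b))  # all jump points of either EDF
    # side="right" counts observations <= x, i.e. the right-continuous EDF.
    fa = np.searchsorted(a, grid, side="right") / len(a)
    fb = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(fa - fb)))
```

Applied to bootstrap draws of $\Delta_n^*$ and Monte Carlo draws of $\Delta_n$, a value that fails to shrink as $n$ grows is consistent with, but does not prove, the failure of (3.1).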

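3.2. Bootstrapping from the NPMLE $\tilde F_n$. Consider now the case in
which $m_n = n$, $\hat F_n = \tilde F_n$, and
$\hat f_n(t_0) = \tilde f_n(t_0)$. Let

$\mathbb{Z}_n^*(h) := n^{2/3}\{\mathbb{F}_n^*(t_0 + n^{-1/3}h) - \mathbb{F}_n^*(t_0) - \tilde f_n(t_0) n^{-1/3} h\}$

for $h \in I_n = [-n^{1/3} t_0, \infty)$, where $\mathbb{F}_n^*$ is the EDF of
the bootstrap sample $X_1^*, \ldots, X_n^* \sim \tilde F_n$. Then
$\mathbb{Z}_n^* = \mathbb{Z}^*_{n,1} + \mathbb{Z}_{n,2}$, where

(3.2)  $\mathbb{Z}^*_{n,1}(h) = n^{2/3}\{(\mathbb{F}_n^* - \tilde F_n)(t_0 + n^{-1/3}h) - (\mathbb{F}_n^* - \tilde F_n)(t_0)\}$,

(3.3)  $\mathbb{Z}_{n,2}(h) = n^{2/3}\{\tilde F_n(t_0 + h n^{-1/3}) - \tilde F_n(t_0) - \tilde f_n(t_0) n^{-1/3} h\}$.

Further, let $\mathbb{W}_1$ and $\mathbb{W}_2$ be two independent two-sided
standard Brownian motions on $\mathbb{R}$ with
$\mathbb{W}_1(0) = \mathbb{W}_2(0) = 0$,

$\mathbb{Z}_1(h) = \mathbb{W}_1[f(t_0)h]$,

$\mathbb{Z}_2^0(h) = \mathbb{W}_2[f(t_0)h] + \tfrac12 f'(t_0)h^2$,

$\mathbb{Z}_2(h) = L_{\mathbb{R}} \mathbb{Z}_2^0(h) - L_{\mathbb{R}} \mathbb{Z}_2^0(0) - (L_{\mathbb{R}} \mathbb{Z}_2^0)'(0)h$,

$\mathbb{Z} = \mathbb{Z}_1 + \mathbb{Z}_2$.

Then $\Delta_n^*$ equals the left derivative at $h = 0$ of the LCM of
$\mathbb{Z}_n^*$. It is first shown that $\mathbb{Z}_n^*$ converges in
distribution to $\mathbb{Z}$ but the conditional distributions of
$\mathbb{Z}_n^*$ do not have a limit. The following two lemmas are needed.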

Lemma 3.1. Let $W_n$ and $W_n^*$ be random vectors in $\mathbb{R}^l$ and
$\mathbb{R}^k$, respectively; let $Q$ and $Q^*$ denote distributions on the
Borel sets of $\mathbb{R}^l$ and $\mathbb{R}^k$; and let $\mathcal{F}_n$ be
sigma-fields for which $W_n$ is $\mathcal{F}_n$-measurable. If the
distribution of $W_n$ converges to $Q$ and the conditional distribution of
$W_n^*$ given $\mathcal{F}_n$ converges in probability to $Q^*$, then the
joint distribution of $(W_n, W_n^*)$ converges to the product measure
$Q \times Q^*$.

Proof. The above lemma can be proved easily using characteristic functions.
Kosorok (2008) includes a detailed proof. □

The next lemma uses a special case of the Convergence of Types theorem
[Loève (1963), page 203]: let $V, W, V_n$ be random variables and $b_n$ be
constants; if $V$ has a nondegenerate distribution, $V_n \Rightarrow V$ as
$n \to \infty$, and $V_n + b_n \Rightarrow W$, then
$b = \lim_{n \to \infty} b_n$ exists and $W$ has the same distribution as
$V + b$.

Lemma 3.2. Let $\mathbf{X}_n^*$ be a bootstrap sample generated from the
data $\mathbf{X}_n$. Let $Y_n := \psi_n(\mathbf{X}_n)$ and
$Z_n := \phi_n(\mathbf{X}_n, \mathbf{X}_n^*)$ where
$\psi_n : \mathbb{R}^n \to \mathbb{R}$ and
$\phi_n : \mathbb{R}^{2n} \to \mathbb{R}$ are measurable functions; and let
$K_n$ and $L_n$ be the conditional distribution functions of $Y_n + Z_n$ and
$Z_n$ given $\mathbf{X}_n$, respectively. If there are distribution functions
$K$ and $L$ for which $L$ is nondegenerate,
$d(K_n, K) \stackrel{P}{\to} 0$ and $d(L_n, L) \stackrel{P}{\to} 0$, then
there is a random variable $Y$ for which $Y_n \stackrel{P}{\to} Y$.

Proof. If $\{n_k\}$ is any subsequence, then there exists a further
subsequence $\{n_{k_l}\}$ for which $d(K_{n_{k_l}}, K) \to 0$ a.s. and
$d(L_{n_{k_l}}, L) \to 0$ a.s. Then $Y := \lim_{l \to \infty} Y_{n_{k_l}}$
exists a.s. by the Convergence of Types theorem, applied conditionally given
$\mathbf{X} := (X_1, X_2, \ldots)$ with $b_l = Y_{n_{k_l}}$. Note that $Y$
does not depend on the subsequence $\{n_{k_l}\}$, since two such subsequences
can be joined to form another subsequence, from which uniqueness follows. □

Theorem 3.1. (i) The conditional distribution of $\mathbb{Z}^*_{n,1}$ given
$\mathbf{X} = (X_1, X_2, \ldots)$ converges a.s. to the distribution of
$\mathbb{Z}_1$.

(ii) The unconditional distribution of $\mathbb{Z}_{n,2}$ converges to that
of $\mathbb{Z}_2$ and the unconditional distributions of
$(\mathbb{Z}^*_{n,1}, \mathbb{Z}_{n,2})$, and $\mathbb{Z}_n^*$ converge to
those of $(\mathbb{Z}_1, \mathbb{Z}_2)$ and $\mathbb{Z}$.

(iii) The unconditional distribution of $\Delta_n^*$ converges to that of
$(L_{\mathbb{R}} \mathbb{Z})'(0)$, and (3.1) fails.

(iv) Conditional on $\mathbf{X}$, the distribution of $\mathbb{Z}_n^*$ does
not have a weak limit in probability.

(v) If the conditional distribution function of $\Delta_n^*$ converges in
probability, then $(L_{\mathbb{R}} \mathbb{Z})'(0)$ and $\mathbb{Z}_2$ must be
independent.

Proof. (i) The conditional convergence of $\mathbb{Z}^*_{n,1}$ follows from
Proposition 2.1 with $m_n = n$, $F_n = \tilde F_n$,
$\mathbb{F}_{n,m_n} = \mathbb{F}_n^*$, applied conditionally given
$\mathbf{X}$. It is only necessary to show that (2.3) holds a.s., and this
follows from the Law of the Iterated Logarithm for $\mathbb{F}_n$ and
Marshall's lemma, as explained in Section 2.3. The unconditional limiting
distribution of $\mathbb{Z}^*_{n,1}$ must also be that of $\mathbb{Z}_1$.

(ii) Let

$\mathbb{Z}^0_{n,2}(h) = n^{2/3}[\mathbb{F}_n(t_0 + n^{-1/3}h) - \mathbb{F}_n(t_0) - f(t_0) n^{-1/3} h]$

and observe that

$\mathbb{Z}_{n,2}(h) = L_{I_n} \mathbb{Z}^0_{n,2}(h) - [L_{I_n} \mathbb{Z}^0_{n,2}(0) + (L_{I_n} \mathbb{Z}^0_{n,2})'(0)h]$.

The unconditional convergence of $\mathbb{Z}^0_{n,2}$ and
$L_{I_n} \mathbb{Z}^0_{n,2}$ follows from Corollary 2.7 applied with
$F_n \equiv F$, as explained in Section 2.3. The convergence in distribution
of $\mathbb{Z}_{n,2}$ now follows from the Continuous Mapping theorem, using
Lemma 2.5 and arguments similar to those in the proof of Corollary 2.6.

It remains to show that $\mathbb{Z}^*_{n,1}$ and $\mathbb{Z}^0_{n,2}$ are
asymptotically independent, that is, the joint limit distribution of
$\mathbb{Z}^*_{n,1}$ and $\mathbb{Z}^0_{n,2}$ is the product of their marginal
limit distributions. For this, it suffices to show that
$(\mathbb{Z}^*_{n,1}(t_1), \ldots, \mathbb{Z}^*_{n,1}(t_k))$ and
$(\mathbb{Z}^0_{n,2}(s_1), \ldots, \mathbb{Z}^0_{n,2}(s_l))$ are asymptotically
independent, for all choices $-\infty < t_1 < \cdots < t_k < \infty$ and
$-\infty < s_1 < \cdots < s_l < \infty$. This is an easy consequence of
Lemma 3.1 applied with
$W_n^* = (\mathbb{Z}^*_{n,1}(t_1), \ldots, \mathbb{Z}^*_{n,1}(t_k))$ and
$W_n = (\mathbb{Z}^0_{n,2}(s_1), \ldots, \mathbb{Z}^0_{n,2}(s_l))$, and
$\mathcal{F}_n = \sigma(X_1, X_2, \ldots, X_n)$.

(iii) We will appeal to Corollary 2.6 to find the unconditional distribution
of $\Delta_n^*$. We already know that $\mathbb{Z}_n^*$ converges in
distribution to $\mathbb{Z}$. That (2.10) holds for the limit $\mathbb{Z}$
can be directly verified from the definition of the process. We only have to
show that (2.9) holds unconditionally with $\mathbb{Z}_n = \mathbb{Z}_n^*$.

Let $\varepsilon > 0$ and $\gamma > 0$ be given. By Proposition 2.9, there
exist $\delta > 0$ and $C > 0$ such that $P(A_n) \ge 1 - \varepsilon$ for all
$n > N_0$, where

$A_n := \{|\tilde F_n(t_0 + h) - \tilde F_n(t_0) - f(t_0)h - \tfrac12 f'(t_0)h^2| \le \gamma h^2 + C n^{-2/3},\ \forall |h| \le \delta\}$.

We can also assume that
$|F(t_0 + h) - F(t_0) - f(t_0)h - (1/2) f'(t_0)h^2| \le \gamma h^2$ for
$|h| \le \delta$. Let
$\mathbb{Y}_n^*(h) = n^{2/3}[\mathbb{F}_n^*(t_0 + n^{-1/3}h) - \mathbb{F}_n^*(t_0) - f(t_0) n^{-1/3} h]$,
so that $\mathbb{Z}_n^*(h) = \mathbb{Y}_n^*(h) - \Delta_n h$ for all
$h \in I_n$, and

$L_K \mathbb{Z}_n^* = L_K \mathbb{Y}_n^* - \Delta_n h$

for all $h \in K$ for any interval $K \subseteq I_n$.

Let $G_n = \tilde F_n 1_{A_n} + F 1_{A_n^c}$ and let $P^*_{G_n}$ denote the
probability when generating the bootstrap samples from $G_n$. Then $G_n$
satisfies (2.1), (2.3), (2.7) and (2.8) a.s. with $m_n = n$, $F_n = G_n$,
$\mathbb{F}_{n,m_n} = \mathbb{F}_n^* 1_{A_n} + \mathbb{F}_n 1_{A_n^c}$ and
$f_n = f$. Let $J$ be a compact interval. By Proposition 2.4, applied
conditionally, there exists a compact interval $K$ (not depending on
$\omega$, by the remark near the end of the proof of Proposition 2.4) such
that $K \supseteq J$ and

$P^*_{G_n}[L_{I_{m_n}} \mathbb{Y}_n^* = L_K \mathbb{Y}_n^* \text{ on } J](\omega) \ge 1 - \varepsilon$

for $n \ge N(\omega)$ for a.e. $\omega$. As $N(\cdot)$ is bounded in
probability, there exists $N_1 > 0$ such that $P(B) \ge 1 - \varepsilon$,
where $B := \{\omega : N(\omega) \le N_1\}$. By increasing $N_1$ if necessary,
let us also suppose that $N_1 \ge N_0$. Then

$P[L_{I_{m_n}} \mathbb{Z}_n^* = L_K \mathbb{Z}_n^* \text{ on } J] = P[L_{I_{m_n}} \mathbb{Y}_n^* = L_K \mathbb{Y}_n^* \text{ on } J]$
  $\ge \int_{A_n} P^*[L_{I_{m_n}} \mathbb{Y}_n^* = L_K \mathbb{Y}_n^* \text{ on } J](\omega)\, dP(\omega)$
  $= \int_{A_n} P^*_{G_n}[L_{I_{m_n}} \mathbb{Y}_n^* = L_K \mathbb{Y}_n^* \text{ on } J](\omega)\, dP(\omega)$
  $\ge \int_{A_n \cap B} P^*_{G_n}[L_{I_{m_n}} \mathbb{Y}_n^* = L_K \mathbb{Y}_n^* \text{ on } J](\omega)\, dP(\omega)$
  $\ge \int_{A_n \cap B} (1 - \varepsilon)\, dP(\omega) \ge 1 - 3\varepsilon$ for all $n \ge N_1$

as $P(A_n \cap B) \ge 1 - 2\varepsilon$ for $n \ge N_1$. Thus, (2.9) holds and
Corollary 2.6 gives $\Delta_n^* \Rightarrow (L_{\mathbb{R}} \mathbb{Z})'(0)$.

If (3.1) holds in probability, then the unconditional limit distribution of
$\Delta_n^*$ would be that of $2|\tfrac12 f(t_0) f'(t_0)|^{1/3}\, \mathbb{C}$,
which is different from the distribution of
$(L_{\mathbb{R}} \mathbb{Z})'(0)$, giving rise to a contradiction.

Fig. 1. Scatter plot of 10,000 random draws of
$((L_{\mathbb{R}} \mathbb{Z})'(0), (L_{\mathbb{R}} \mathbb{Z}_2^0)'(0))$ when
$f(t_0) = 1$ and $f'(t_0) = -2$.

(iv) We use the method of contradiction. Let
$Z_n := \mathbb{Z}^*_{n,1}(h_0)$ and $Y_n := \mathbb{Z}_{n,2}(h_0)$ for some
fixed $h_0 > 0$ (say $h_0 = 1$) and suppose that the conditional distribution
function of $Z_n + Y_n = \mathbb{Z}_n^*(h_0)$ converges in probability to the
distribution function $G$. By Proposition 2.1, the conditional distribution
of $Z_n$ converges in probability to a normal distribution, which is
obviously nondegenerate. Thus, the assumptions of Lemma 3.2 are satisfied and
we conclude that $Y_n \stackrel{P}{\to} Y$, for some random variable $Y$. It
then follows from the Hewitt-Savage zero-one law that $Y$ is a constant, say
$Y = c_0$ w.p. 1. The contradiction arises since $Y_n$ converges in
distribution to $\mathbb{Z}_2(h_0)$ which is not a constant a.s.

(v) We can show that the (unconditional) joint distribution of
$(\Delta_n^*, \mathbb{Z}^0_{n,2})$ converges to that of
$((L_{\mathbb{R}} \mathbb{Z})'(0), \mathbb{Z}_2^0)$. But $\Delta_n^*$ and
$\mathbb{Z}^0_{n,2}$ are asymptotically independent by Lemma 3.1 applied to
$W_n = (\mathbb{Z}^0_{n,2}(t_1), \mathbb{Z}^0_{n,2}(t_2), \ldots, \mathbb{Z}^0_{n,2}(t_l))$,
where $t_i \in \mathbb{R}$, $W_n^* = \Delta_n^*$ and
$\mathcal{F}_n = \sigma(X_1, X_2, \ldots, X_n)$. Therefore,
$(L_{\mathbb{R}} \mathbb{Z})'(0)$ and $\mathbb{Z}_2^0$ are independent. The
proposition follows directly since $\mathbb{Z}_2$ is a measurable function of
$\mathbb{Z}_2^0$. □

If the conditional distribution of $\Delta_n^*$ converges in probability,
then, as a consequence of (v) of Theorem 3.1,
$(L_{\mathbb{R}} \mathbb{Z})'(0)$ and $(L_{\mathbb{R}} \mathbb{Z}_2^0)'(0)$
must also be independent. Figure 1 shows the scatter plot of
$(L_{\mathbb{R}} \mathbb{Z})'(0)$ and $(L_{\mathbb{R}} \mathbb{Z}_2^0)'(0)$
obtained from a simulation study with 10,000 samples, $f(t_0) = 1$ and
$f'(t_0) = -2$. The correlation coefficient obtained, $-0.2999$, is highly
significant ($p$-value $< 0.0001$). Thus, when combined with simulations, (v)
of Theorem 3.1 strongly suggests that the conditional distribution of
$\Delta_n^*$ does not converge in probability.
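
The order of magnitude of the reported significance can be checked with a short calculation. The sketch below assumes the usual $t$-statistic for testing zero Pearson correlation, with a normal approximation to its null distribution (reasonable for 10,000 draws); the text does not state which test was actually used, and the function name `corr_test` is ours.

```python
import math

def corr_test(r, n):
    """t-statistic and two-sided p-value (normal approximation) for H0: rho = 0."""
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    p = math.erfc(abs(t) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|t|))
    return t, p

t_stat, p_value = corr_test(-0.2999, 10_000)
```

With $r = -0.2999$ and 10,000 draws the statistic is roughly $-31$, so the $p$-value is far below the quoted bound of 0.0001. Since independence implies zero correlation, rejecting zero correlation is evidence against independence.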

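3.3. Bootstrapping from the EDF. A similar, slightly simpler pattern
arises if the bootstrap sample is drawn from $\hat F_n = \mathbb{F}_n$.
Define $\mathbb{Z}_n^*$ as before, and let
$\mathbb{Z}^*_{n,1}(h) = n^{2/3}\{(\mathbb{F}_n^* - \mathbb{F}_n)(t_0 + n^{-1/3}h) - (\mathbb{F}_n^* - \mathbb{F}_n)(t_0)\}$
and
$\mathbb{Z}_{n,2}(h) = n^{2/3}\{\mathbb{F}_n(t_0 + h n^{-1/3}) - \mathbb{F}_n(t_0) - \tilde f_n(t_0) n^{-1/3} h\}$.
Then $\mathbb{Z}_n^* = \mathbb{Z}^*_{n,1} + \mathbb{Z}_{n,2}$. Recall the
definition of the processes
$\mathbb{W}_1, \mathbb{W}_2, \mathbb{Z}_1, \mathbb{Z}_2^0$ in Section 3.2.
Define

$\mathbb{Z}_2(h) = \mathbb{Z}_2^0(h) - (L_{\mathbb{R}} \mathbb{Z}_2^0)'(0)h$.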



Theorem 3.2. (i) The conditional distribution of $\mathbb{Z}^*_{n,1}$ given
$\mathbf{X} = (X_1, X_2, \ldots)$ converges a.s. to the distribution of
$\mathbb{Z}_1$.

(ii) The unconditional distribution of $\mathbb{Z}_{n,2}$ converges to that
of $\mathbb{Z}_2$ and the unconditional distributions of
$(\mathbb{Z}^*_{n,1}, \mathbb{Z}_{n,2})$, and $\mathbb{Z}_n^*$ converge to
those of $(\mathbb{Z}_1, \mathbb{Z}_2)$ and $\mathbb{Z}$.

(iii) The unconditional distribution of $\Delta_n^*$ converges to that of
$(L_{\mathbb{R}} \mathbb{Z})'(0)$, and (3.1) fails.

(iv) Conditional on $\mathbf{X}$, the distribution of $\mathbb{Z}_n^*$ does
not have a weak limit in probability.

(v) If the conditional distribution function of $\Delta_n^*$ converges in
probability, then $(L_{\mathbb{R}} \mathbb{Z})'(0)$ and $\mathbb{Z}_2$ must be
independent.

Remark. The proof of this theorem runs along similar lines to that of
Theorem 3.1. We briefly highlight the differences.

(i) The conditional convergence of $\mathbb{Z}^*_{n,1}$ follows from
Proposition 2.1 with $m_n = n$, $F_n = \mathbb{F}_n$,
$\mathbb{F}_{n,m_n} = \mathbb{F}_n^*$, applied conditionally given
$\mathbf{X}$. It is only necessary to show that (2.3) is satisfied almost
surely, and this follows from the Law of the Iterated Logarithm for
$\mathbb{F}_n$, as explained in Section 2.3. Then the unconditional limiting
distribution of $\mathbb{Z}^*_{n,1}$ must also be that of $\mathbb{Z}_1$.

(ii) The proof is similar to that of (ii) of Theorem 3.1, except that now

$\mathbb{Z}_{n,2}(h) = \mathbb{Z}^0_{n,2}(h) - (L_{I_n} \mathbb{Z}^0_{n,2})'(0)h$.

The proofs of (iii)-(v) are very similar to those of (iii)-(v) of Theorem 3.1.

Fig. 2. Histograms of the exact distribution of $\Delta_n$ (left panel) and
the two bootstrap distributions while drawing bootstrap samples from
$\mathbb{F}_n$ (middle panel) and $\tilde F_n$ (right panel) for $n = 500$.

3.4. Performance of the bootstrap methods in finite samples. In this
subsection, we illustrate the poor finite sample performance of the two
inconsistent bootstrap schemes, namely, bootstrapping from the EDF
$\mathbb{F}_n$ and the NPMLE $\tilde F_n$. Table 1 shows the estimated
coverage probabilities of nominal 95% confidence intervals for $f(1)$ using
the two bootstrap methods for different sample sizes, when the true
distribution is assumed to be Exponential(1) and |Normal(0, 1)|,
respectively. We used 1000 bootstrap samples to compute each confidence
interval and then constructed 1000 such confidence intervals to estimate the
actual coverage probabilities. As is clear from the table, the coverage
probabilities fall well short of the nominal 0.95 value. Leger and MacGibbon
(2006) also illustrate such a discrepancy between the nominal and actual
coverage probabilities while bootstrapping from the EDF for Chernoff's
estimator of the mode.

Figure 2 shows the histograms (computed from 10,000 bootstrap samples) of
the two inconsistent bootstrap distributions obtained from a single sample of
500 Exponential(1) random variables, along with the histogram of the exact
distribution of $\Delta_n$ (obtained from simulation). The bootstrap
distributions are skewed and have very different shapes and supports compared
to the exact distribution in the left panel of Figure 2. The histograms
illustrate the inconsistency of the bootstrap procedures.

Table 1

Estimated coverage probabilities of nominal 95% confidence intervals for
$f(1)$ while bootstrapping from the EDF $\mathbb{F}_n$ and the NPMLE
$\tilde F_n$, with varying sample size $n$, for the two models: Exponential(1)
(left) and $|Z|$ where $Z \sim$ Normal(0, 1) (right)

       Exponential(1)            |Z|, Z ~ Normal(0, 1)
    n     EDF     NPMLE          n     EDF     NPMLE
   50    0.747    0.720         50    0.761    0.739
  100    0.776    0.755        100    0.778    0.757
  200    0.802    0.780        200    0.780    0.762
  500    0.832    0.797        500    0.788    0.755

The estimated coverage probabilities in Table 1 are unconditional [see (iii)
of Theorems 3.1 and 3.2] and do not provide direct evidence to suggest that
the conditional distribution of $\Delta_n^*$ does not converge in probability.
Figure 3 shows the estimated 0.95 quantile of the bootstrap distribution for
two independent data sequences as the sample size increases from 500 to
10,000, for the two bootstrap procedures, and for both models (exponential
and normal). The bootstrap quantile fluctuates enormously even at very large
sample sizes and shows signs of nonconvergence. If the bootstrap were
consistent, the estimated quantiles should converge to 0.6887 for the
exponential model (0.8269 for the normal model), the 0.95 quantile of the
limit distribution of $\Delta_n$, indicated by the solid line in Figure 3.
From the left panel of Figure 3, we see that the estimated bootstrap 0.95
quantiles (obtained from the two procedures) for one data sequence stay below
0.6887, while for the other, the 0.95 quantiles stay above 0.6887, indicating
the strong dependence on the sample path. Note that if the bootstrap
distributions had a limit, then Figure 3 suggests that the limit varies with
the sample path, and that is impossible as explained in Section 3.1. This
provides evidence for the nonconvergence of the bootstrap estimator.

4. Consistent bootstrap methods. The main reason for the inconsistency of
the bootstrap methods discussed in the previous section is the lack of
smoothness of the distribution function from which the bootstrap samples are
generated. The EDF $\mathbb{F}_n$ does not have a density, and $\tilde F_n$
does not have a differentiable density, whereas $F$ is assumed to have a
nonzero differentiable density at $t_0$. At a more technical level, the lack
of smoothness manifests itself through the failure of (2.4).

The results from Section 2 may be directly applied to derive sufficient
conditions on the smoothness of the distribution from which the bootstrap
samples are generated. Let $X_1, X_2, \ldots \stackrel{\text{i.i.d.}}{\sim} F$;
let $\hat F_n$ be an estimate of $F$ computed from $X_1, \ldots, X_n$; and let
$\hat f_n$ be the density of $\hat F_n$ or a surrogate, as in Section 3.

Fig. 3. Estimated 0.95 quantile of the bootstrap distribution while
generating the bootstrap samples from $\mathbb{F}_n$ (dashed lines) and
$\tilde F_n$ (solid-dotted lines) for two independent data sequences along
with the 0.95 quantile of the limit distribution of $\Delta_n$ (solid line)
for the two models: Exponential(1) (left panel) and $|Z|$ where $Z \sim$
Normal(0, 1) (right panel).

Theorem 4.1. If (2.1), (2.3), (2.4), (2.7) and (2.8) hold a.s. with
$F_n = \hat F_n$ and $f_n = \hat f_n$, then the bootstrap estimate is strongly
consistent, that is, (3.1) holds w.p. 1. In particular, the bootstrap estimate
is strongly consistent if there is a $\delta > 0$ for which $\hat F_n$ has a
continuously differentiable density $\hat f_n$ on
$[t_0 - \delta, t_0 + \delta]$, and (2.11) holds a.s. with $F_n = \hat F_n$
and $f_n = \hat f_n$.

Proof. That $\Delta_n^*$ converges weakly to the distribution on the
right-hand side of (1.1) a.s. follows from Corollary 2.7 applied conditionally
given X with $F_n = \hat F_n$ and $f_n = \hat f_n$. The second assertion
follows similarly from Corollary 2.8. □

4.1. Smoothing $\tilde F_n$. We show that generating bootstrap samples from a
suitably smoothed version of $\tilde F_n$ leads to a consistent bootstrap
procedure. To avoid boundary effects and ensure that the smoothed version has
a decreasing density on $(0,\infty)$, we use a logarithmic transformation. Let
$K$ be a twice continuously differentiable symmetric density for which

(4.1)  $\int_{-\infty}^{\infty} [K(z) + |K'(z)| + |K''(z)|] e^{\eta|z|}\,dz < \infty$

for some $\eta > 0$. Let

$K_h(x,u) = \frac{1}{hx} K\left[\frac{1}{h}\log\left(\frac{u}{x}\right)\right]$  and

(4.2)  $\check f_n(x) = \int_0^\infty K_h(x,u)\tilde f_n(u)\,du = \int_0^\infty K_h(1,u)\tilde f_n(xu)\,du.$

Thus, $e^y \check f_n(e^y) = \int_{-\infty}^{\infty} h^{-1}K[h^{-1}(y-z)]\tilde f_n(e^z)e^z\,dz$.
Integrating and using capital letters to denote distribution functions,

$\check F_n(e^y) = \int_{-\infty}^{y} \check f_n(e^s)e^s\,ds = \int_{-\infty}^{\infty} K(z)\tilde F_n(e^{y-hz})\,dz.$

Alternatively, integrating (4.2) by parts yields

$\check f_n(x) = -\int_0^\infty \frac{\partial}{\partial u}K_h(x,u)\tilde F_n(u)\,du.$
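
The convolution form of the smoothed distribution function has a simple sampling interpretation: on the log scale the smoothed law is the law of $\log X' + hZ$ with $X'$ drawn from the least concave majorant and $Z$ drawn from $K$ independently, so a bootstrap observation can be generated as $X' e^{hZ}$. The sketch below illustrates this under assumptions that are not part of the text: $K$ is taken to be the standard normal density (which is symmetric, twice continuously differentiable and has the required exponential moments), all observations are strictly positive, and the names `lcm_vertices` and `draw_smoothed` are hypothetical.

```python
import bisect
import math
import random
from collections import Counter


def lcm_vertices(sample):
    """Vertices (x, y) of the least concave majorant of the EDF, from (0, 0)."""
    n = len(sample)
    counts = Counter(sample)
    points = [(0.0, 0.0)]
    cum = 0
    for x in sorted(counts):               # distinct values, ties merged
        cum += counts[x]
        points.append((x, cum / n))
    hull = []
    for px, py in points:
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            # drop b when it lies on or below the chord from a to p
            if (bx - ax) * (py - ay) - (by - ay) * (px - ax) >= 0:
                hull.pop()
            else:
                break
        hull.append((px, py))
    return hull


def draw_smoothed(sample, size, h, rng=random):
    """Draw `size` values X' * exp(h * Z): X' from the LCM, Z ~ N(0, 1)."""
    hull = lcm_vertices(sample)
    xs = [x for x, _ in hull]
    ys = [y for _, y in hull]              # strictly increasing, from 0 to 1
    draws = []
    for _ in range(size):
        u = rng.random()                   # uniform on [0, 1)
        j = bisect.bisect_right(ys, u)     # ys[j-1] <= u < ys[j]
        frac = (u - ys[j - 1]) / (ys[j] - ys[j - 1])
        x = xs[j - 1] + frac * (xs[j] - xs[j - 1])   # invert the LCM
        draws.append(x * math.exp(h * rng.gauss(0.0, 1.0)))
    return draws
```

With `h = 0` the function reduces to sampling from the least concave majorant itself. A bandwidth such as `h = n ** -0.2` is one illustrative choice for which $h_n \to 0$ while $h_n^2\sqrt{n/\log\log n} \to \infty$.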
The proof of (3.1) requires showing that $\check F_n$ and its derivatives are
sufficiently close to those of $F$, and it is convenient to separate the
estimation error $\check F_n - F$ into sampling and approximation error. Thus,
let

$\bar F_h(e^y) = \int_{-\infty}^{\infty} K(z)F(e^{y-hz})\,dz.$

We denote the first and second derivatives of $\bar F_h$ by $\bar f_h$ and
$\bar f_h'$, respectively. Recall that $F$ is assumed to have a nonincreasing
density on $(0,\infty)$ that is continuously differentiable near $t_0$.

Lemma 4.1. $\lim_{h\to 0}\|\bar F_h - F\| = 0$, and there is a $\delta > 0$
for which

(4.4)  $\lim_{h\to 0}\sup_{|x-t_0|\le\delta} [|\bar f_h(x) - f(x)| + |\bar f_h'(x) - f'(x)|] = 0.$

Proof. First, observe that

$\bar F_h(e^y) - F(e^y) = \int_{-\infty}^{\infty} K(z)[F(e^{y-hz}) - F(e^y)]\,dz$

by (4.3). That $\lim_{h\to 0}\bar F_h(x) = F(x)$ for all $x > 0$ follows easily
from the Dominated Convergence theorem, and uniform convergence then follows
from Polya's theorem. This establishes the first assertion of the lemma. Next,
consider (4.4). Given $t_0 > 0$, let $y_0 = \log(t_0)$ and let $\delta > 0$ be
so small that $e^y f(e^y)$ is continuously differentiable (in $y$) on
$[y_0 - 2\delta, y_0 + 2\delta]$. Then

$\bar f_h(x) - f(x) = \int_{-\infty}^{\infty} K(z)[f(xe^{hz}) - f(x)]e^{hz}\,dz + f(x)\int_{-\infty}^{\infty} (e^{hz} - 1)K(z)\,dz$

and thus

$\sup_{|x-t_0|\le\delta} |\bar f_h(x) - f(x)| \le \int_{-\infty}^{\infty} \sup_{|x-t_0|\le\delta} |f(xe^{hz}) - f(x)|\,e^{hz}K(z)\,dz + O(h^2)$

for any $0 < \delta < t_0$. For sufficiently small $\delta$, the integrand
approaches zero as $h \to 0$; and it is bounded by
$\sup_{|x-t_0|\le\delta}(e^{-hz}/x + f(x))e^{hz}K(z)$, since $f(x) \le 1/x$
for all $x > 0$. So the right-hand side approaches zero as $h \to 0$ by the
Dominated Convergence theorem. That
$\sup_{|x-t_0|\le\delta} |\bar f_h'(x) - f'(x)| \to 0$ may be established
similarly. □

Theorem 4.2. Let $K$ be a twice continuously differentiable, symmetric
density for which (4.1) holds. If

$h = h_n \to 0$  and  $h_n^2\sqrt{\frac{n}{\log\log(n)}} \to \infty,$

then the bootstrap estimator is strongly consistent; that is, (3.1) holds a.s.

Proof. By Theorem 4.1, it suffices to show that (2.11) holds a.s. with
$F_n = \check F_n$ and $f_n = \check f_n$; and this would follow from

$\|\check F_n - \bar F_h\| + \sup_{|x-t_0|\le\delta} [|\check f_n(x) - \bar f_h(x)| + |\check f_n'(x) - \bar f_h'(x)|] \to 0$  a.s.

for some $\delta > 0$ and Lemma 4.1. Clearly, using (4.3),

(4.5)  $\check F_n(e^y) - \bar F_h(e^y) = \frac{1}{h}\int_{-\infty}^{\infty} [\tilde F_n(e^t) - F(e^t)]\,K\left(\frac{y-t}{h}\right)dt$

for all $y$, so that

$\|\check F_n - \bar F_h\| \le \|\tilde F_n - F\| \le \|\mathbb{F}_n - F\| = O[\sqrt{\log\log(n)/n}]$  a.s.

by Marshall's lemma and the Law of the Iterated Logarithm. Differentiating
(4.5) gives

$\check f_n(e^y) - \bar f_h(e^y) = \frac{e^{-y}}{h^2}\int_{-\infty}^{\infty} [\tilde F_n(e^t) - F(e^t)]\,K'\left(\frac{y-t}{h}\right)dt.$

Differentiating (4.5) again and then taking absolute values and considering
$0 < h < 1$, we get

$\sup_{|x-t_0|\le\delta} \{|\check f_n(x) - \bar f_h(x)| + |\check f_n'(x) - \bar f_h'(x)|\}$
$\quad \le \frac{M}{h^3}\sup_{|x-t_0|\le\delta}\int_{-\infty}^{\infty} |\tilde F_n(e^t) - F(e^t)|\left[\left|K'\left(\frac{\log x - t}{h}\right)\right| + \left|K''\left(\frac{\log x - t}{h}\right)\right|\right]dt$
$\quad \le \frac{M}{h^2}\|\mathbb{F}_n - F\|\int_{-\infty}^{\infty} [|K'(z)| + |K''(z)|]\,dz \to 0$  a.s.

for a constant $M > 0$, as $h^2\sqrt{n/\log\log(n)} \to \infty$, where
Marshall's lemma and the Law of the Iterated Logarithm have been used again. □

4.2. m out of n bootstrap. In Section 3, we showed that the two most
intuitive methods of bootstrapping are inconsistent. In this section, we show
that the corresponding m out of n bootstrap procedures are weakly consistent.

Theorem 4.3. If $\hat F_n = \mathbb{F}_n$, $\hat f_n = \tilde f_n$, and
$m_n = o(n)$ then the bootstrap procedure is weakly consistent, that is, (3.1)
holds in probability.

Proof. Conditions (2.1), (2.3) and (2.8) hold a.s. from (2.13), as explained
in Section 2.3. To verify (2.7), let $\gamma > 0$ be given. From the proof of
Proposition 2.4 [also see Kim and Pollard (1990), page 218], there exists
$\delta > 0$ such that
$|\mathbb{F}_n(t_0 + h) - \mathbb{F}_n(t_0) - F(t_0 + h) + F(t_0)| \le \gamma h^2 + C_n n^{-2/3}$
for $|h| \le \delta$, where the $C_n$'s are random variables of order
$O_P(1)$. We can also assume that
$|F(t_0 + h) - F(t_0) - f(t_0)h - (1/2)f'(t_0)h^2| \le (1/2)\gamma h^2$ for
$|h| \le \delta$. Then, using the inequality $2|ab| \le \gamma a^2 + b^2/\gamma$,

(4.6)  $|\mathbb{F}_n(t_0 + h) - \mathbb{F}_n(t_0) - h\tilde f_n(t_0) - \frac{1}{2}h^2 f'(t_0)|$
$\quad \le |\mathbb{F}_n(t_0 + h) - \mathbb{F}_n(t_0) - hf(t_0) - \frac{1}{2}h^2 f'(t_0)| + |h||\tilde f_n(t_0) - f(t_0)|$
$\quad \le \{\gamma h^2 + C_n n^{-2/3} + \frac{1}{2}\gamma h^2\} + \{\frac{1}{2}\gamma h^2 + \frac{1}{2\gamma}|\tilde f_n(t_0) - f(t_0)|^2\}$
$\quad \le 2\gamma h^2 + C_n n^{-2/3} + O_P(n^{-2/3}) \le 2\gamma h^2 + o_P(m_n^{-2/3}).$

For (2.4), write

(4.7)  $m_n^{2/3}\{\mathbb{F}_n(t_0 + m_n^{-1/3}h) - \mathbb{F}_n(t_0) - m_n^{-1/3}\tilde f_n(t_0)h\}$
$\quad = m_n^{2/3}\{(\mathbb{F}_n - F)(t_0 + m_n^{-1/3}h) - (\mathbb{F}_n - F)(t_0)\}$
$\qquad + m_n^{1/3}[f(t_0) - \tilde f_n(t_0)]h + \frac{1}{2}f'(t_0)h^2 + o(1)$
$\quad \to_P \frac{1}{2}f'(t_0)h^2$

uniformly on compacts, using Hungarian Embedding to bound the second line and
(1.1) (and a two-term Taylor expansion) in the third.

Given any subsequence $\{n_k\} \subset \mathbb{N}$, there exists a further
subsequence $\{n_{k_l}\}$ such that (4.6) and (4.7) hold a.s. and Theorem 4.1
is applicable. Thus, (3.1) holds for the subsequence $\{n_{k_l}\}$, thereby
showing that (3.1) holds in probability. □
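
As an illustration of the procedure covered by Theorem 4.3, the sketch below resamples $m$ points from the EDF and forms the statistic $m^{1/3}\{f_m^*(t_0) - \tilde f_n(t_0)\}$. It is a minimal sketch under assumptions: the function names are hypothetical, observations are taken to be strictly positive, and the choice $m = \lceil n^{2/3} \rceil$ used in the usage note is only one example of a sequence with $m_n = o(n)$; the theorem does not prescribe a particular rate.

```python
import math
import random
from collections import Counter


def grenander(sample, t0):
    """Left-hand slope at t0 of the least concave majorant of the EDF."""
    n = len(sample)
    counts = Counter(sample)
    points, cum = [(0.0, 0.0)], 0
    for x in sorted(counts):               # distinct values, ties merged
        cum += counts[x]
        points.append((x, cum / n))
    hull = []
    for px, py in points:
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            # drop b when it lies on or below the chord from a to p
            if (bx - ax) * (py - ay) - (by - ay) * (px - ax) >= 0:
                hull.pop()
            else:
                break
        hull.append((px, py))
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 < t0 <= x1:
            return (y1 - y0) / (x1 - x0)
    return 0.0


def m_out_of_n_deltas(sample, t0, m, n_boot=500, rng=random):
    """Bootstrap draws of m^(1/3) * (f*_m(t0) - f_n(t0)), m points per draw."""
    f_n = grenander(sample, t0)
    scale = m ** (1.0 / 3.0)
    return [
        scale * (grenander(rng.choices(sample, k=m), t0) - f_n)
        for _ in range(n_boot)
    ]
```

For a sample `data` of size `n`, one might call `m_out_of_n_deltas(data, 1.0, math.ceil(n ** (2 / 3)))` and use the empirical quantiles of the returned values in place of those of the limit distribution in (1.1).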

Next consider bootstrapping from $\tilde F_n$. We will assume slightly
stronger conditions on $F$, namely, conditions (a)-(d) mentioned in Theorem
7.2.3 of Robertson, Wright and Dykstra (1988):

(a) $\alpha_1(F) = \inf\{x : F(x) = 1\} < \infty$,

(b) $F$ is twice continuously differentiable on $(0, \alpha_1(F))$,

(c) $\gamma(F) = \frac{\sup_{0<x<\alpha_1(F)} |f'(x)|}{\inf_{0<x<\alpha_1(F)} f^2(x)} < \infty$,

(d) $\beta(F) = \inf_{0<x<\alpha_1(F)} \left|\frac{f'(x)}{f^2(x)}\right| > 0$.

Theorem 4.4. Suppose that (a)-(d) hold. If $\hat F_n = \tilde F_n$,
$\hat f_n = \tilde f_n$, and $m_n = o[n(\log n)^{-3/2}]$ then (3.1) holds in
probability.

Proof. Conditions (2.1), (2.3) and (2.8) again follow from (2.13), as
explained in Section 2.3. The verification of (2.7) is similar to the argument
in the proof of Theorem 4.3. We show that (2.4) holds. Adding and subtracting
$m_n^{2/3}[\mathbb{F}_n(t_0 + m_n^{-1/3}h) - \mathbb{F}_n(t_0)]$ from
$Z_{n,2}(h)$ and using (4.7) and the result of Kiefer and Wolfowitz (1976),

$\sup_{|h|\le c} |Z_{n,2}(h) - \frac{1}{2}f'(t_0)h^2| \le 2m_n^{2/3}\|\tilde F_n - \mathbb{F}_n\| + o_P(1)$
$\quad = O_P[m_n^{2/3}n^{-2/3}\log(n)] + o_P(1)$

for any $c > 0$, from which (2.4) follows easily. □

5. Discussion. We have shown that bootstrap estimators are inconsistent when
bootstrap samples are drawn from either the EDF $\mathbb{F}_n$ or its least
concave majorant $\tilde F_n$, but consistent when the bootstrap samples are
drawn from a smoothed version of $\tilde F_n$ or an m out of n bootstrap is
used. We have also derived necessary conditions for the bootstrap estimator to
have a conditional weak limit when bootstrapping from either $\mathbb{F}_n$ or
$\tilde F_n$, and presented compelling numerical evidence that these
conditions are not satisfied. While these results have been obtained for the
Grenander estimator, our results and findings have broader implications for
the (in)-consistency of the bootstrap methods in problems with an $n^{1/3}$
convergence rate.

To illustrate the broader implications, we contrast our findings with those
of Abrevaya and Huang (2005), who considered a more general framework, as in
Kim and Pollard (1990). For simplicity, we use the same notation as in
Abrevaya and Huang (2005). Let $W_n := r_n(\theta_n - \theta_0)$ and
$\hat W_n := r_n(\hat\theta_n - \theta_n)$ be the sample and bootstrap
statistics of interest. In our case $r_n = n^{1/3}$, $\theta_0 = f(t_0)$,
$\theta_n = \tilde f_n(t_0)$ and $\hat\theta_n = \tilde f_n^*(t_0)$. When
specialized to the Grenander estimator, Theorem 2 of Abrevaya and Huang (2005)
would imply [by calculations similar to those in their Theorem 5 for the NPMLE
in a binary choice model] that

$\hat W_n \Rightarrow \arg\max \hat Z(t) - \arg\max Z(t)$

conditional on the original sample, in $P^\infty$-probability, where
$Z(t) = W(t) - ct^2$ and $\hat Z(t) = W(t) + \hat W(t) - ct^2$, $W$ and
$\hat W$ are two independent two-sided Brownian motions on $\mathbb{R}$ with
$W(0) = \hat W(0) = 0$, and $c$ is a positive constant depending on $F$. We
also know that $W_n \Rightarrow \arg\max Z(t)$ unconditionally. By (v) of
Theorem 3.1, this would force the independence of $\arg\max Z(t)$ and
$\arg\max \hat Z(t) - \arg\max Z(t)$; but there is overwhelming numerical
evidence that these random variables are correlated.

APPENDIX

Lemma A.1. Let $\Psi : \mathbb{R} \to \mathbb{R}$ be a function such that
$\Psi(h) \le M$ for all $h \in \mathbb{R}$, for some $M > 0$, and

(A.1)  $\lim_{|h|\to\infty} \frac{\Psi(h)}{|h|} = -\infty.$

Then for any $b > 0$, there exists $c_0 > b$ such that for any $c \ge c_0$,
$L_{\mathbb{R}}\Psi(h) = L_{[-c,c]}\Psi(h)$ for all $|h| \le b$.

Proof. Note that for any $c > 0$, $L_{\mathbb{R}}\Psi(h) \ge L_{[-c,c]}\Psi(h)$
for all $h \in [-c, c]$. Given $b > 0$, consider $c > b$ and
$\Phi_c(h) = L_{[-c,c]}\Psi(h)$ for $h \in [-b, b]$, and let $\Phi_c$ be the
linear extension of $L_{[-c,c]}\Psi|_{[-b,b]}$ outside $[-b, b]$. We will show
that there exists $c_0 > b + 1$ such that $\Phi_{c_0} \ge \Psi$. Then
$\Phi_{c_0}$ will be a concave function everywhere greater than $\Psi$, and
thus $\Phi_{c_0} \ge L_{\mathbb{R}}\Psi$. Hence,
$L_{\mathbb{R}}\Psi(h) \le \Phi_{c_0}(h) = L_{[-c_0,c_0]}\Psi(h)$ for
$h \in [-b, b]$, yielding the desired result.

For any $c > b + 1$, $\Phi_c(h) = \Phi_c(b) - \Phi_c'(b) + \Phi_c'(b)(h - b + 1)$
for $h \ge b$. Using the min-max formula,

$\Phi_c'(b) = \min_{-c\le s\le b}\,\max_{b\le t\le c} \frac{\Psi(t) - \Psi(s)}{t - s} \ge \min_{-c\le s\le b} \frac{\Psi(b+1) - \Psi(s)}{(b+1) - s} \ge \Psi(b+1) - M =: B_0 \le 0.$

Thus,

$\Phi_c(h) = \Phi_c(b) - \Phi_c'(b) + \Phi_c'(b)(h - b + 1)$
$\quad \ge \{\Psi(b) - \Phi_c'(b)\} + \Phi_c'(b)(h - b + 1)$
$\quad \ge \Psi(b) + (h - b)B_0$

for $h \ge b + 1$. Observe that $B_0$ does not depend on $c$. Combining this
with a similar calculation for $h \le -(b + 1)$, there are $K_0 \ge 0$ and
$K_1 \ge 0$, depending only on $b$, for which
$\Phi_c(h) \ge -K_0 - K_1|h|$ for $|h| \ge b + 1$. From (A.1), there is
$c_0 > b + 1$ for which $\Psi(h) \le -K_0 - K_1|h|$ for all $|h| \ge c_0$, in
which case $\Psi(h) \le \Phi_{c_0}(h)$ for all $h$. It follows that
$L_{\mathbb{R}}\Psi(h) \le \Phi_{c_0}(h)$ for $|h| \le b$. □

Lemma A.2. Let $B$ be a standard Brownian motion. If $a, b, c > 0$,
$a^3 b = 1$, then

(A.2)  $P\left[\sup_{t\in\mathbb{R}} \frac{|B(t)|}{a + bt^2} > c\right] = P\left[\sup_{s\in\mathbb{R}} \frac{|B(s)|}{1 + s^2} > c\right].$

Proof. This follows directly from rescaling properties of Brownian motion by
letting $t = a^2 s$. □
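
For completeness, the rescaling step can be written out as follows. This is only a worked version of the substitution $t = a^2 s$ named in the proof, using the scaling property that $\{B(a^2 s)\}$ has the same law as $\{aB(s)\}$; the absolute value in the numerator is assumed as in the statement of (A.2).

```latex
\begin{align*}
\sup_{t\in\mathbb{R}} \frac{|B(t)|}{a + bt^2}
  &= \sup_{s\in\mathbb{R}} \frac{|B(a^2 s)|}{a + b a^4 s^2}
     && (t = a^2 s) \\
  &= \sup_{s\in\mathbb{R}} \frac{|B(a^2 s)|}{a(1 + s^2)}
     && (b a^4 = a \text{ because } a^3 b = 1) \\
  &\stackrel{d}{=} \sup_{s\in\mathbb{R}} \frac{a\,|B(s)|}{a(1 + s^2)}
   = \sup_{s\in\mathbb{R}} \frac{|B(s)|}{1 + s^2}
     && (\{B(a^2 s)\} \stackrel{d}{=} \{aB(s)\}).
\end{align*}
```

Since the two suprema have the same distribution, the probabilities that they exceed $c$ coincide, which is (A.2).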

Proof of Proposition 2.4. Let J = [01,02] and e > be as in the 
statement of the proposition; let 7= |/'(io)|/16; and recall (2.5) and (2.6) 
from the proof of Proposition 2.1. Then there exists < 8 < 1, C > 1, and 
no > 1 for which (2.7) and (2.8) hold for all n > uq. Let /* := [— Smj/ 3 , 5ml/ 3 ] 
By making 5 smaller, if necessary, and using Lemma 2.3, Lj mn 7j n (h) = 

Lj* 7L n {K) for \h\ < 5ml/ 3 j 2 for all but a finite number of n w.p. 1. By 
increasing the values of C and no, if necessary, we may suppose that the 
right-hand side of (A. 2) (with c = C) is less than e/3, that P[\r]\ > C] + 

Pfsupo^imi^lE^Ct) - B°Jt)| >C}< e/3, and that L lm Z n = L^l* 

on [— ^5ml/ 3 , ^5ml/ 3 ] with probability at least 1 — e/3 for all n > uq. We can 

also assume that a := 8C 3 /7 > 1. Then, using Lemma A. 2 with a = am n 
and b = a~ 3 , the following relations hold simultaneously with probability at 
least 1 — e for n > no: 

\M mn [F n (t ) + s] - M mn [F n (t ))\ < Ciam- 1 ' 6 + a~ 3 ^s 2 ) for all s, 

L Imn Z n = L I * nn Z n on 

and 

sup m l J^ mn {t)-Ml n (t)\<C. 
o<t<\ 

Let B n be the event that these four conditions hold. Then P{B n ) > 1 — e 
for n > no, and from (2.6), B n implies 

|Zn,l(/»)| < C{a + a-\i 2 r ! 3 [F n (t + m" 1 ^) - F n (t )} 2 } + 2C 

(A.3) + Cml/ 6 \F n (t + m" 1/3 /i) - F n (t )| 

< 4C{a + a" 1 ™ 2 / 3 ^^ + m" 1/3 /i) - i^o)] 2 } 

using the inequalities \F n (to + m n l ^ 3 h) — F n (to)\ < am n 1 ^ + a~ l ml/^[F n (to + 
m n l l 3 h) — F n (to)] 2 and a > 1. For sufficiently large n, using (2.8), we have 

\Zn,l(h)\ < 4C[a + a" 1 C 2 7n2/ 3 (m^ 1 / 3 |/ l | + m' 1 ' 3 ) 2 ] 

(A.4) <4C[a + 2a- 1 C 2 (/i 2 + l)] 

= 7 /i 2 + c 

for \h\ < ^m^ 3 with C = 4Ca + 8C 3 a _1 . Also, we can show that \L n ^{h) — 

f'(t )h 2 /2\ <jh 2 + C for all \h\ < 5ml/ 3 by (2.7). Let b 2 > a 2 be such that 
-5 7 (a 2 + b 2 ) 2 + 6 7 (a| + 6|) - 8C > 0. 



i m V3 * m i/3 

2 n ' 2 n 



<C, 
Recalling that $\gamma = -f'(t_0)/16$, $B_n$ implies

-IO7/1 2 - 2C < Z n (h) = Z nA (h) + Zn )2 (h) < -67/1 2 + 2C 

1 /3 

for < <5m n and sufficiently large n. Since the right-hand side is concave, 
B n also implies Lj* Z n (/i) < — 67/1 2 + 2C for \h\ < (Jm^ 3 . Therefore, for 
sufficiently large n, using the upper bound on Lj* Z n , the lower bound on 

1/3 

Z n obtained above, and Lj mn Z, n (h) = Lj*^ Z n (/i) for < <5m n /2 on I? n , 
and [a 2 , 62] C , we have 

2Z n (^±^j - [L Imn Z n {a 2 ) + L Im Z n (b 2 )} 

> -5 7 (a 2 + 6 2 ) 2 + 6 7 (a 2 + 6 2 ,) - 8C > 

with probability at least 1 — e. Thus, _B n implies 27j n [^(a 2 + b 2 )] > Lj mn Z n (a 2 ) + 
L/ m Z n (b 2 ) with probability at least 1 — e. Similarly, U n implies that there 
is a 61 < ai for which 2Z n [|(ai +61)] > L/ mn Z n (ai) + L/ mn Z n (6i) with prob- 
ability at least 1 — e. Relation (2.9) then follows from Lemma 2.2. It is worth 
noting as a remark that 61,62 do not depend on the sequence F n . 

Next, consider (2.10). Given a compact $J = [-b, b]$, let $c_0(\omega)$ be the
smallest positive integer such that for any $c \ge c_0$,
$L_{\mathbb{R}}Z(h) = L_{[-c,c]}Z(h)$ for $h \in J$. That $c_0$ exists and is
finite w.p. 1 follows from Lemma A.1. Defining $W_c := L_{[-c,c]}Z$ and
$Y = L_{\mathbb{R}}Z$, the event $\{W_c \ne Y \text{ on } J\} \subset \{c_0 > c\}$.
Now given any $\varepsilon > 0$, there exists $c$ such that
$P[c_0 \le c] \ge 1 - \varepsilon$. Therefore,

$P[L_{\mathbb{R}}Z = L_{[-c,c]}Z \text{ on } J] \ge P[c_0 \le c] \ge 1 - \varepsilon.$ □

Proof of Proposition 2.9. First, consider $\mathbb{F}_n$. Let
$0 < \gamma < |f'(t_0)|/2$ be given. There is a $0 < \delta < \frac{1}{2}t_0$
such that

(A.5)  $|F(t_0 + h) - F(t_0) - f(t_0)h - \frac{1}{2}f'(t_0)h^2| \le \frac{1}{2}\gamma h^2$

for $|h| \le 2\delta$. From the proof of Proposition 2.4, using arguments
similar to deriving (A.3) and (A.4), we can show that

$|(\mathbb{F}_n - F)(t_0 + h) - (\mathbb{F}_n - F)(t_0)| \le \frac{1}{2}\gamma h^2 + Cn^{-2/3}$

for $|h| \le 2\delta$ with probability at least $1 - \varepsilon$ for
sufficiently large $n$. Therefore, by adding and subtracting
$F(t_0 + h) - F(t_0)$ and using (A.5),

(A.6)  $|\mathbb{F}_n(t_0 + h) - \mathbb{F}_n(t_0) - f(t_0)h - \frac{1}{2}f'(t_0)h^2| \le \gamma h^2 + Cn^{-2/3}$

for $|h| \le 2\delta$ with probability at least $1 - \varepsilon$ for large
$n$.

Next, consider $\tilde F_n$. Let $B_n$ denote the event that (A.6) holds. Then
$P(B_n)$ is eventually larger than $1 - \varepsilon$ and on $B_n$, we have

$\mathbb{F}_n(t_0 + h) - \mathbb{F}_n(t_0) - f(t_0)h \le \{\gamma - \frac{1}{2}|f'(t_0)|\}h^2 + Cn^{-2/3}$

for $|h| \le 2\delta$. Let $E_n$ be the event that
$\tilde F_n(h) = L_{[t_0-2\delta,\,t_0+2\delta]}\mathbb{F}_n(h)$ for
$h \in [t_0 - \delta, t_0 + \delta]$. Then by Lemma 2.3,
$P(E_n) \ge 1 - \varepsilon$ for all sufficiently large $n$. Taking concave
majorants on either side of the above display for $|h| \le 2\delta$ and noting
that the right-hand side of the display is already concave, we have
$\tilde F_n(t_0 + h) - \mathbb{F}_n(t_0) - f(t_0)h \le \{\gamma - \frac{1}{2}|f'(t_0)|\}h^2 + Cn^{-2/3}$
for $|h| \le \delta$ on $B_n \cap E_n$. Setting $h = 0$ shows that on
$E_n \cap B_n$, $\tilde F_n(t_0) - \mathbb{F}_n(t_0) \le Cn^{-2/3}$. Now, as
$\mathbb{F}_n(t_0) \le \tilde F_n(t_0)$, it is also the case that on
$E_n \cap B_n$, for $|h| \le \delta$,

(A.7)  $\tilde F_n(t_0 + h) - \tilde F_n(t_0) - f(t_0)h \le \{\gamma - \frac{1}{2}|f'(t_0)|\}h^2 + Cn^{-2/3}.$

Furthermore, on $E_n \cap B_n$,

(A.8)  $\tilde F_n(t_0 + h) - \tilde F_n(t_0) - f(t_0)h - \frac{1}{2}f'(t_0)h^2$
$\quad \ge \mathbb{F}_n(t_0 + h) - \{\mathbb{F}_n(t_0) + Cn^{-2/3}\} - f(t_0)h - \frac{1}{2}f'(t_0)h^2$
$\quad \ge -\gamma h^2 - 2Cn^{-2/3}.$

Therefore, combining (A.7) and (A.8),

$|\tilde F_n(t_0 + h) - \tilde F_n(t_0) - f(t_0)h - \frac{1}{2}f'(t_0)h^2| \le \gamma h^2 + 2Cn^{-2/3}$

for $|h| \le \delta$ with probability at least $1 - 2\varepsilon$ for large
$n$. □